Articulatory movement prediction using deep bidirectional long short-term memory based recurrent neural networks and word/phone embeddings
Abstract
Automatic prediction of articulatory movements from speech or text can benefit many applications, such as speech recognition and synthesis. A recent approach has reported state-of-the-art performance in speech-to-articulatory prediction using feed-forward neural networks. In this paper, we investigate the feasibility of using bidirectional long short-term memory based recurrent neural networks (BLSTM-RNNs) for articulatory movement prediction, because of their ability to model trajectories over long contexts. We show on the MNGU0 dataset that the BLSTM-RNN clearly outperforms feed-forward networks and pushes the state-of-the-art RMSE from 0.885 mm to 0.565 mm. Predicting articulatory information from text, on the other hand, relies heavily on handcrafted linguistic and prosodic features, e.g., POS and ToBI labels. We propose to substitute word and phone embeddings for these manual features. Word/phone embedding features are learned automatically from unlabeled text data by a neural network language model. We show that word and phone embeddings can achieve comparable performance without using POS and ToBI features. More promisingly, combining the conventional full feature set with phone embeddings achieves the lowest RMSE.
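As a rough illustration of the metric quoted in the abstract, the sketch below computes a root-mean-square error between a predicted and a measured articulator trajectory, and the relative reduction implied by the two reported figures (0.885 mm and 0.565 mm). The function names, the toy trajectories, and the choice to pool all frames and channels into one average are assumptions made for this example; the abstract does not say how the paper averages its errors.

```python
import math


def rmse(predicted, measured):
    """RMSE pooled over all frames and channels (same unit as the inputs, e.g. mm).

    Assumption: errors are pooled across every frame and channel before the
    square root; the paper may average differently.
    """
    if len(predicted) != len(measured):
        raise ValueError("trajectories must have the same number of frames")
    squared_errors = []
    for p_frame, m_frame in zip(predicted, measured):
        if len(p_frame) != len(m_frame):
            raise ValueError("frames must have the same number of channels")
        squared_errors.extend((p - m) ** 2 for p, m in zip(p_frame, m_frame))
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline, improved):
    """Fraction by which `improved` lowers the error relative to `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical toy data: two frames, two articulator coordinates each (mm).
predicted = [[1.0, 2.0], [3.0, 4.0]]
measured = [[1.0, 2.5], [2.5, 4.0]]
toy_rmse = rmse(predicted, measured)

# The two RMSE values below are the ones reported in the abstract.
reduction = relative_reduction(0.885, 0.565)

print(f"toy RMSE: {toy_rmse:.4f} mm")
print(f"relative RMSE reduction: {reduction:.1%}")
```

Under these assumptions, the reported drop from 0.885 mm to 0.565 mm corresponds to a relative error reduction of roughly 36%.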
Similar resources
Chemlistem - chemical named entity recognition using recurrent neural networks
Chemical named entity recognition has traditionally been dominated by CRF (Conditional Random Fields)-based approaches, but given the success of the artificial neural network techniques known as "deep learning", we decided to examine them as an alternative to CRFs. We present here three systems. The first system translates the traditional CRF-based idioms into a deep learning framework, using ric...
Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks
Motivation Capturing long-range interactions between structural but not sequence neighbors of proteins is a long-standing challenging problem in bioinformatics. Recently, long short-term memory (LSTM) networks have significantly improved the accuracy of speech and image classification problems by remembering useful past information in long sequential events. Here, we have implemented deep bidir...
Articulatory Feature Extraction Using CTC to Build Articulatory Classifiers Without Forced Frame Alignments for Speech Recognition
Articulatory features provide robustness to speaker and environment variability by incorporating speech production knowledge. Pseudo articulatory features are a way of extracting articulatory features using articulatory classifiers trained from speech data. One of the major problems faced in building articulatory classifiers is the requirement of speech data aligned in terms of articulatory fea...
Prediction of Covid-19 Prevalence and Fatality Rates in Iran Using Long Short-Term Memory Neural Network
Introduction: The rapid spread of COVID-19 has become a critical threat to the world. So far, millions of people worldwide have been infected with the disease. The Covid-19 pandemic has had significant effects on various aspects of human life. Currently, prediction of the virus's spread is essential in order to be safe and make necessary arrangements. It can help control the rate of its outbrea...
Publication date: 2015